Wide Area Network Monitoring System for HEP Experiments at Fermilab

Authors

  • Maxim Grigoriev
  • Les Cottrell
  • Connie Logg
Abstract

Large, distributed High Energy Physics (HEP) collaborations, such as D0, CDF and US-CMS, depend on stable and robust network paths between major world research centres. The evolving emphasis on data and compute Grids increases the reliance on network performance. Fermilab's experimental groups and network support personnel identified a critical need for WAN monitoring to ensure the quality and efficient utilization of such network paths. This led to the development of the Network Monitoring system presented in this paper. The system evolved from the IEPM-BW project, started at SLAC three years ago. At Fermilab it has developed into a fully functional infrastructure with bi-directional active network probes and path characterizations. It is based on the Iperf achievable-throughput tool, together with Ping and Synack to test ICMP/TCP connectivity. It uses Pipechar and Traceroute to test, compare and report hop-by-hop network path characteristics. It also measures real file transfer performance with BBFTP and GridFTP. The Monitoring system has an extensive web interface, and all the data is available through standalone SOAP web services or a MonALISA client. The paper also presents a case study of network path asymmetry and abnormal performance between FNAL and SDSC, which was discovered and resolved using the Network Monitoring system.
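
To illustrate the kind of comparison that bi-directional Traceroute data makes possible, the following is a minimal sketch, not the system's actual implementation. It assumes each direction has already been reduced to an ordered list of router identifiers. In practice forward and reverse probes often report different interface addresses for the same router, so that mapping is itself an assumption. The function name `first_asymmetric_hop` and the router names are hypothetical.

```python
from typing import List, Optional


def first_asymmetric_hop(forward: List[str], reverse: List[str]) -> Optional[int]:
    """Return the index (in forward order) of the first hop where the two
    directions disagree, or None when the paths mirror each other.

    Both lists hold router identifiers in the order each probe saw them,
    so the reverse list is flipped before comparing.
    """
    mirrored = list(reversed(reverse))
    for i, (f, r) in enumerate(zip(forward, mirrored)):
        if f != r:
            return i
    # A shared prefix with different lengths also counts as asymmetry.
    if len(forward) != len(mirrored):
        return min(len(forward), len(mirrored))
    return None


# Hypothetical router names, not taken from the paper.
forward_path = ["site-a-border", "backbone-1", "backbone-2", "site-b-border"]
reverse_path = ["site-b-border", "regional-9", "backbone-1", "site-a-border"]

print(first_asymmetric_hop(forward_path, reverse_path))
```

A non-`None` result only flags where the two directions diverge. Deciding whether that divergence explains abnormal throughput would still require the throughput and file transfer measurements the abstract describes.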

Similar resources

Neuro-Fuzzy Based Algorithm for Online Dynamic Voltage Stability Status Prediction Using Wide-Area Phasor Measurements

In this paper, a novel neuro-fuzzy based method combined with a feature selection technique is proposed for online dynamic voltage stability status prediction of a power system. This technique uses synchronized phasors measured by phasor measurement units (PMUs) in a wide-area measurement system. In order to minimize the number of neuro-fuzzy inputs, training time and complication of neuro-fuzzy ...

End-to-End Network/Application Performance Troubleshooting Methodology

The computing models for HEP experiments are globally distributed and grid-based. Obstacles to good network performance arise from many causes and can be a major impediment to the success of the computing models for HEP experiments. Factors that affect overall network/application performance exist on the hosts themselves (application software, operating system, hardware), in the local area netw...

G-NetMon: A GPU-accelerated Network Performance Monitoring System for Large Scale Scientific Collaborations

At Fermilab, we have prototyped a GPU-accelerated network performance monitoring system, called G-NetMon, to support large-scale scientific collaborations. Our system exploits the data parallelism that exists within network flow data to provide fast analysis of bulk data movement between Fermilab and collaboration sites. Experiments demonstrate that our G-NetMon can rapidly detect sub-optimal bu...

A Fast Voltage Collapse Detection and Prevention Based on Wide Area Monitoring and Control

Voltage stability is one of the most important factors in maintaining reliable operation of power systems. When a disturbance occurs in the power system, it usually causes instabilities and sometimes leads to voltage collapse (VC). To avoid such problems, a novel approach called Vector Analysis (VA) is proposed that exploits a new instability detection index to provide wide area voltage stabili...

Developing a New Decision Support System to Manage Human Reliability based on HEART Method

Human performance and reliability monitoring have become the main issue for many industries since human error ratios cannot be mitigated to the zero level and many accidents, malfunctions, and quality defects are happening due to the human in production systems. Since the human resources implement a different range of tasks, the calculation of human error probability (HEP) is complicated, and s...

Publication date: 2004